-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow parsing Content-Type and Accept headers with version #61427
Allow parsing Content-Type and Accept headers with version #61427
Conversation
Content-Type and Accept headers expect a verioned form of media types like application/vnd.elasticsearch+json;compatible-with=7 when previously it was simple application/json or similar - it is still supported
Pinging @elastic/es-core-infra (:Core/Infra/REST API) |
server/src/test/java/org/elasticsearch/common/xcontent/XContentTypeTests.java
Outdated
Show resolved
Hide resolved
@@ -114,13 +116,19 @@ public XContent xContent() { | |||
} | |||
}; | |||
|
|||
private static final Pattern COMPATIBLE_API_HEADER_PATTERN = Pattern.compile( | |||
"(application|text)/(vnd.elasticsearch\\+)?([^;]+)(\\s*;\\s*compatible-with=(\\d+))?", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regex needs a few more escapes:
- Escape the
.
withinvnd.elasticsearch+
Escape the/
between the type and subtype
(application|text)\\/(vnd\\.elasticsearch\\+)?([^;]+)(\\s*;\\s*compatible-with=(\\d+))?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something I am curious about is the (\\d+))?
at the end. If possible, it would be nice to define this as part of our "spec". This would be something we could be strict (eschew leniency!) about, such as enforcing it to be in a specific format (and throwing an exception when someone specifies a pattern that is not correct. We can also put the format into the documentation.
If you agree with the sentiment, I would be curious what we expect the final number to be, for example, which of these do we intend to support (now and in the future):
- compatible-with=7
- compatible-with=7.11
- compatible-with=7.x
- compatible-with=7x
- compatible-with=7.11.2
Then I think we should make this strict so users find out early if they've send an illegal value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sethmlarson Why does the forward slash needs escaping? It's not special in Java?
I wondering if we should anchor the pattern with ^
at the start and $
at the end, to be on the safe side?
In ([^;]+)(\\s*;
, the [^;]+
will also consume whitespace, so I believe you can remove the \\s*
.
Can we switch any of these capture groups to non-capturing if we don't actually need the contents, just the grouping? i.e (?:<pattern>)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pugnascotia You're right the /
doesn't need an escape in Java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sethmlarson fixed the \\.
@dakrone Only major versions are supported, hence only digits are allowed - representing the major part of the version #51816 (no dots or any other character, preventing parsing minor version)
Good point on validation - how much should be done, how precise the exception should be.. @jakelandis or @jaymode any views on this?
@pugnascotia good point on consuming a space. The subtype (json, yaml, etc) are not allowing spaces. Added the anchoring and skipped capturing when possible. I think all of the groups require capturing though (except maybe for the optional ; charset=UTF-8
)
@@ -114,13 +116,19 @@ public XContent xContent() { | |||
} | |||
}; | |||
|
|||
private static final Pattern COMPATIBLE_API_HEADER_PATTERN = Pattern.compile( | |||
"(application|text)/(vnd.elasticsearch\\+)?([^;]+)(\\s*;\\s*compatible-with=(\\d+))?", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(vnd.elasticsearch\+)?
Am I reading this correctly that the vnd.elasticsearch
part is completely optional? I thought the intent was to only add compatible-with
when vnd.elasticsearch
is present?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right. This is possibly a bit oversimplified.
Do you think we should aim to be more strict on the regex side or should we validate the values after the parsing with more relaxed regex?
The more strict could look like
Pattern.compile(
"(application|text)/((vnd\\.elasticsearch\\+(json|smile|yaml|cbor)(\\s*;\\s*compatible-with=(\\d+)))" +
"|(json|smile|yaml|cbor))");
there is a simplistic validation in parseVerion
method, but it still allows to add compatible-with
even when subtype vnd.elasticsearch
is not present. The version won't be parsed though
I have reworked the regex. I tried not to be overly strict as there has to be some flexibility to allow additional subtypes defined in SQL plugin (see |
I think that if we want to make the I made an attempt here, but I have some SQL test failing now. #61845 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some questions and I also wonder if we really need to use a regex for this parsing? Also, compatible-with
is a vendor specific parameter if I understand correctly but I am not sure that there should be a requirement that this parameter comes first vs charset being sent in the header first?
libs/x-content/src/main/java/org/elasticsearch/common/xcontent/XContentType.java
Outdated
Show resolved
Hide resolved
libs/x-content/src/main/java/org/elasticsearch/common/xcontent/XContentType.java
Outdated
Show resolved
Hide resolved
libs/x-content/src/main/java/org/elasticsearch/common/xcontent/XContentType.java
Outdated
Show resolved
Hide resolved
libs/x-content/src/main/java/org/elasticsearch/common/xcontent/XContentType.java
Outdated
Show resolved
Hide resolved
Do you think something as simple as (
You are right, I think there is no need for |
I was thinking about the possibility of having a new class for doing the parsing to move it away from public class ContentTypeParser {
public ContentTypeParser(Set<String> acceptedMimeTypes) { // plugins should be able to provide these so we don't break sql. maybe need more than a string?
}
public ContentType parse(String contentTypeHeader) {
// split string on semicolon
// validate media type is accepted (IIRC whitespace is ok so trim it)
// rest of strings are params. validate per RFC 7230 and use ones that we care about
// or use a regex and we can change if necessary
}
public ContentType fromAcceptHeader(String acceptHeader) { // not necessary for this PR but I guess it is for SQL
}
}
public class ContentType {
public String mediaType() {}
public Version compatibleWith() {} // or return the int version
public Charset charset() {} // not necessary for this
} Then |
libs/x-content/src/main/java/org/elasticsearch/common/xcontent/XContentType.java
Outdated
Show resolved
Hide resolved
Splitting method XContentType.fromMediaTypeOrFormat into two separate methods. This will help to validate media type provided in Accept or Content-Type headers. Extract parsing logic from XContentType (fromMediaType and fromFormat methods) to a separate MediaTypeParser class. This will help reuse the same parsing logic for XContentType and TextFormat (used in sql) `Media-Types type/subtype; parameters` parsing is in defined https://tools.ietf.org/html/rfc7231#section-3.1.1.1 part of #61427
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good. A few minor comments. However, could we get someone from @elastic/es-ql to take a look a the the changes for SQL ?
libs/x-content/src/main/java/org/elasticsearch/common/xcontent/MediaTypeParser.java
Outdated
Show resolved
Hide resolved
libs/x-content/src/main/java/org/elasticsearch/common/xcontent/XContentType.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/common/xcontent/XContentTypeTests.java
Show resolved
Hide resolved
libs/x-content/src/main/java/org/elasticsearch/common/xcontent/MediaTypeParser.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/common/xcontent/XContentTypeTests.java
Show resolved
Hide resolved
Pinging @elastic/es-ql (:Query Languages/SQL) |
I updated the label to be only for 8.0 and >breaking For clarity, the breaking here is that with this change we will only support requests for charset=UTF-8. Prior to this change we accepted any charset requested (or any random paramater)...but I think always encoded in UTF-8 and ignored the parameters (except for SQL). So it is non-passive in the fact we are more strict about what we accept. With this PR we will not throw an exception if we don't support it (to avoid scope creep), however if the charset (or any parameters) are not supported it will result in the default JSON return type. "Accept: application/yaml;charset=junk" results in application/json. A follow up PR will throw the necessary exception to result in the more correct 40x error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, pending someone from @elastic/es-ql validating the SQL part(s).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
x-pack/plugin/sql/src/main/java/org/elasticsearch/xpack/sql/plugin/RestSqlQueryAction.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/sql/src/main/java/org/elasticsearch/xpack/sql/plugin/RestSqlQueryAction.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Jay Modi <[email protected]>
7.x client can pass media type with a version which will return a 7.x version of the api in ES 8. In ES server 7 this media type shoulld be accepted but it serve the same version of the API (7x) relates #61427
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SQL bits look good, I've left only some minor comments.
Map.of("header", "present|absent", "charset", "utf-8")) | ||
.withMediaTypeAndParams(TextFormat.CSV.typeWithSubtype(), TextFormat.CSV, | ||
Map.of("header", "present|absent", "charset", "utf-8", | ||
"delimiter", "[^\"\n\r\t]+")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the delimiter regex derived from TextFormat.CSV#delimiter()? Because it specifies what characters are allowed, inline with what that method checks, on one hand, but allows a count of multiple character, out of line with a check there, on the other.
The exception messages in the method are explanatory and I'd ever so slightly prefer to return those (and maybe don't enforce a specific set here), but if you'd still like to enforce the param format in the parser definition, I'd suggest to maybe also limit the char count to one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, I misread that method. It should only be one character.
The exception messages are indeed explanatory in this method. With the parameter validation when parsing, we will loose them - that code will not be reachable as it will fail earlier when parsing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this to expect the presence of a delimiter
parameter, but the pattern will be .+
which will always pass, and the validation will be done in TextFormat.CSV#delimiter()
x-pack/plugin/sql/src/main/java/org/elasticsearch/xpack/sql/plugin/SqlMediaTypeParser.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/sql/src/main/java/org/elasticsearch/xpack/sql/plugin/SqlMediaTypeParser.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/sql/src/main/java/org/elasticsearch/xpack/sql/plugin/SqlMediaTypeParser.java
Show resolved
Hide resolved
x-pack/plugin/sql/src/test/java/org/elasticsearch/xpack/sql/plugin/SqlMediaTypeParserTests.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Bogdan Pintea <[email protected]>
…lastic#61427)" This reverts commit d04edcd
Content-Type and Accept headers expect a versioned form of media types
like application/vnd.elasticsearch+json;compatible-with=7
when previously it was simple application/json or (cbor, yaml..) - it is still
supported
next step after #61987
relates #60516